108 research outputs found

Benchmark datasets for biomedical knowledge graphs with negative statements

Author: Pesquita Catia
Silva Sara
Sousa Rita T.
Publication date: 21/07/2023

Knowledge graphs represent facts about real-world entities. Most of these facts are defined as positive statements. Negative statements are scarce but highly relevant under the open-world assumption. Furthermore, they have been demonstrated to improve the performance of several applications, namely in the biomedical domain. However, no benchmark dataset supports the evaluation of methods that consider these negative statements. We present a collection of datasets for three relation prediction tasks (protein-protein interaction prediction, gene-disease association prediction and disease prediction) that aim at circumventing the difficulties in building benchmarks for knowledge graphs with negative statements. These datasets include data from two successful biomedical ontologies, Gene Ontology and Human Phenotype Ontology, enriched with negative statements. We also generate knowledge graph embeddings for each dataset with two popular path-based methods and evaluate the performance in each task. The results show that the negative statements can improve the performance of knowledge graph embeddings.

arXiv.org e-Print Archive

Explainable Representations for Relation Prediction in Knowledge Graphs

Author: Pesquita Catia
Silva Sara
Sousa Rita T.
Publication date: 22/06/2023

Knowledge graphs represent real-world entities and their relations in a semantically-rich structure supported by ontologies. Exploring this data with machine learning methods often relies on knowledge graph embeddings, which produce latent representations of entities that preserve structural and local graph neighbourhood properties, but sacrifice explainability. However, in tasks such as link or relation prediction, understanding which specific features better explain a relation is crucial to support complex or critical applications. We propose SEEK, a novel approach for explainable representations to support relation prediction in knowledge graphs. It is based on identifying relevant shared semantic aspects (i.e., subgraphs) between entities and learning representations for each subgraph, producing a multi-faceted and explainable representation. We evaluate SEEK on two real-world highly complex relation prediction tasks: protein-protein interaction prediction and gene-disease association prediction. Our extensive analysis using established benchmarks demonstrates that SEEK achieves significantly better performance than standard learning representation methods while identifying both sufficient and necessary explanations based on shared semantic aspects. Comment: 16 pages, 3 figures

arXiv.org e-Print Archive

Ontology Matching Techniques for Enterprise Architecture Models

Author: Catia Pesquita
José Borbinha
Marzieh Bakhshandeh
Publication date: 11/04/2020

Abstract. Current Enterprise Architecture (EA) approaches tend to be generic, based on broad meta-models that cross-cut distinct architectural domains. Integrating these models is necessary for an effective EA process, in order to support, for example, benchmarking of business processes or assessing compliance with structured requirements. However, the integration of EA models faces challenges stemming from structural and semantic heterogeneities that could be addressed by ontology matching techniques. To that end, we used AgreementMakerLight, an ontology matching system, to evaluate a set of state-of-the-art matching approaches that could adequately address some of the heterogeneity issues. We assessed the matching of EA models based on the ArchiMate and BPMN languages, which allowed us to draw conclusions about both the potential and the limitations of these techniques in exploring the more complex semantics present in these models. Enterprise Architecture (EA) is a practice to support the analysis, design and implementation of a business strategy in an organization, considering its relevant multiple domains. In recent years, a variety of Enterprise Architecture [...] To support the matching tasks we have used AgreementMakerLight (AML) [...]

CiteSeerX

The epidemiology ontology: an ontology for the semantic annotation of epidemiological resources

Author: Catia Pesquita
Francisco M Couto
João D Ferreira
Mário J Silva
Publication venue: Springer Nature
Publication date: 01/01/2014

BACKGROUND: Epidemiology is a data-intensive and multi-disciplinary subject, where data integration, curation and sharing are becoming increasingly relevant, given its global context and time constraints. The semantic annotation of epidemiology resources is a cornerstone to effectively support such activities. Although several ontologies cover some of the subdomains of epidemiology, we identified a lack of semantic resources for epidemiology-specific terms. This paper addresses this need by proposing the Epidemiology Ontology (EPO) and by describing its integration with other related ontologies into a semantically enabled platform for sharing epidemiology resources. RESULTS: The EPO follows the OBO Foundry guidelines and uses the Basic Formal Ontology (BFO) as an upper ontology. The first version of EPO models several epidemiology and demography parameters as well as transmission of infection processes, participants and related procedures. It currently has nearly 200 classes and is designed to support the semantic annotation of epidemiology resources and data integration, as well as information retrieval and knowledge discovery activities. CONCLUSIONS: EPO is under active development and is freely available at https://code.google.com/p/epidemiology-ontology/. We believe that the annotation of epidemiology resources with EPO will help researchers to gain a better understanding of global epidemiological events by enhancing data integration and sharing.

Springer - Publisher Connector

PubMed Central

Special issue on ontology and linked data matching

Author: Cheatham Michelle
Cruz Isabel
Euzenat Jérôme
Pesquita Catia
Publication venue: IOS Press
Publication date: 01/01/2017

Editorial, Semantic Web journal 8(2):183-18

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

DDB-EDM to FaBiO: The Case of the German Digital Library

Author: Dessı̀ Danilo
Oppenländer Jonas
Oshani Seneviratne Juan Sequeda, Lorena Etcheverry, Catia Pesquita
Sack Harald
Tan Mary Ann
Tietz Tabea
Publication venue: RWTH Aachen
Publication date: 06/11/2021

Cultural heritage portals have the goal of providing users with seamless access to all their resources. This paper introduces initial efforts for a user-oriented restructuring of the German Digital Library (DDB). At present, cultural heritage objects (CHOs) in the DDB are modeled using an extended version of the Europeana Data Model (DDB-EDM), which negatively impacts usability and exploration. These challenges can be addressed by exploiting ontologies and building a knowledge graph from the DDB’s voluminous collection. Towards this goal, an alignment of bibliographic metadata from DDB-EDM to the FRBR-Aligned Bibliographic Ontology (FaBiO) is presented.

KITopen

Results of the Ontology Alignment Evaluation Initiative 2015

Author: Cheatham Michelle
Dragisic Zlatan
Euzenat Jérôme
Faria Daniel
Ferrara Alfio
Flouris Giorgos
Fundulaki Irini
Granada Roger
Ivanova Valentina
Jiménez-Ruiz Ernesto
Lambrix Patrick
Montanelli Stefano
Pesquita Catia
Saveta Tzanina
Shvaiko Pavel
Solimando Alessandro
Trojahn dos Santos Cassia
Zamazal Ondrej
Publication venue: No commercial editor.
Publication date: 01/01/2015

Ontology matching consists of finding correspondences between semantically related entities of two ontologies. OAEI campaigns aim at comparing ontology matching systems on precisely defined test cases. These test cases can use ontologies of different nature (from simple thesauri to expressive OWL ontologies) and use different modalities, e.g., blind evaluation, open evaluation and consensus. OAEI 2015 offered 8 tracks with 15 test cases followed by 22 participants. Since 2011, the campaign has been using a new evaluation modality which provides more automation to the evaluation. This paper is an overall presentation of the OAEI 2015 campaign.

HAL-CentraleSupelec

Scientific Publications of the University of Toulouse II Le Mirail

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Open Archive Toulouse Archive Ouverte

Hal-Diderot

HAL-Rennes 1

Metrics for GO based protein semantic similarity: a systematic evaluation

Author: A Schlicker
A Valencia
André O Falcão
António EN Ferreira
C Pesquita
C Wu
Catia Pesquita
D Devos
D Faria
D Lin
Daniel Faria
E Camon
EB Camon
F Azuaje
F Couto
FM Couto
Francisco M Couto
Gentleman
Hugo Bastos
J Chabalier
J Jiang
J Tuikkala
JL Sevilla
L Stein
P Lord
P Resnik
PH Lee
RM Othman
RM Riensche
S Cao
T Joshi
X Guo
X Wu
Y Tao
Z Lei
ZH Duan
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Several semantic similarity measures have been applied to gene products annotated with Gene Ontology terms, providing a basis for their functional comparison. However, it is still unclear which is the best approach to semantic similarity in this context, since there is no conclusive evaluation of the various measures. Another issue is whether electronic annotations should be used in semantic similarity calculations. Results: We conducted a systematic evaluation of GO-based semantic similarity measures using the relationship with sequence similarity as a means to quantify their performance, and assessed the influence of electronic annotations by testing the measures in the presence and absence of these annotations. We verified that the relationship between semantic and sequence similarity is not linear, but can be well approximated by a rescaled Normal cumulative distribution function. Given that the majority of the semantic similarity measures capture an identical behaviour, but differ in resolution, we used the latter as the main criterion of evaluation. Conclusions: This work has provided a basis for the comparison of several semantic similarity measures, and can aid researchers in choosing the most adequate measure for their work. We have found that the hybrid simGIC was the measure with the best overall performance, followed by Resnik's measure using a best-match average combination approach. We have also found that the average and maximum combination approaches are problematic since both are inherently influenced by the number of terms being combined. We suspect that there may be a direct influence of data circularity in the behaviour of the results including electronic annotations, as a result of functional inference from sequence similarity.
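
The two best-performing measures named in this abstract can be illustrated with a minimal sketch. It assumes the usual formulations: simGIC as a Jaccard index weighted by information content (IC) over annotation sets that already include ancestor terms, and Resnik's measure as the IC of the most informative common ancestor, combined with a best-match average. The toy ontology, the IC values and all function names below are hypothetical and are not taken from the paper's implementation.

```python
# Toy ontology (assumption): root -> t1 -> {t2, t3}. Each term maps to
# the set of its ancestors, including itself.
ancestors = {
    "root": {"root"},
    "t1": {"root", "t1"},
    "t2": {"root", "t1", "t2"},
    "t3": {"root", "t1", "t3"},
}

# Made-up information content values; in practice IC(t) = -log p(t),
# with p(t) estimated from annotation frequencies.
ic = {"root": 0.0, "t1": 1.0, "t2": 2.0, "t3": 3.0}

# Annotation sets already extended with ancestor terms.
protein_a = {"root", "t1", "t2"}
protein_b = {"root", "t1", "t3"}


def sim_gic(terms_a, terms_b, ic):
    """Weighted Jaccard: summed IC of shared terms over summed IC of all terms."""
    denominator = sum(ic[t] for t in terms_a | terms_b)
    if denominator == 0:
        return 0.0
    return sum(ic[t] for t in terms_a & terms_b) / denominator


def resnik(term_x, term_y, ancestors, ic):
    """IC of the most informative ancestor shared by the two terms."""
    common = ancestors[term_x] & ancestors[term_y]
    return max((ic[t] for t in common), default=0.0)


def best_match_average(terms_a, terms_b, term_sim):
    """Average, in both directions, of each term's best match in the other set."""
    if not terms_a or not terms_b:
        return 0.0
    a_to_b = sum(max(term_sim(a, b) for b in terms_b) for a in terms_a) / len(terms_a)
    b_to_a = sum(max(term_sim(b, a) for a in terms_a) for b in terms_b) / len(terms_b)
    return (a_to_b + b_to_a) / 2
```

With these toy values, `sim_gic(protein_a, protein_b, ic)` divides the shared IC (1.0) by the IC of the union (6.0). Unlike a plain average or maximum over all term pairs, the best-match average only looks at each term's best partner, which is one way to read the abstract's remark that the average and maximum approaches depend on the number of terms being combined.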

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Universidade de Lisboa: Repositório.UL

The immunopeptidome from a genomic perspective: Establishing the noncanonical landscape of MHC class I–associated peptides

Author: Alfaro Javier A
Axelson Håkan
Battail Christophe
Bedran Dominika
Bedran Georges
Brennan Paul M
Fahraeus Robin
Gasser Hans-Christof
Goodlett David R
Harrison David J
Hupp Ted R
Kote Sachin
Laird Alexander
Litchfield Kevin
O'Neill J Robert
Palkowski Aleksander
Parys Maciej
Pawlik Maciej
Pesquita Catia
Rajan Ajitha
Symeonides Stefan N
Wang Tongjie
Weke Kenneth
Zanzotto Fabio Massimo
Publication venue: American Association for Cancer Research (AACR)
Publication date: 01/01/2023

G.B., D.B., K.W., A.P., R.F., T.R.H., S.K., and J.A.A. received support from Fundacja na rzecz Nauki Polskiej (FNP) (grant ID: MAB/3/2017). D.R.G. received support from Genome Canada & Genome BC (grant ID: 264PRO). D.J.H. received support from NuCana plc (grant ID: SMD0-ZIUN05). H.A. received support from Swedish Cancer Foundation (grant ID: 211709). H.G. received support from United Kingdom Research and Innovation (UKRI) (grant ID: EP/S02431X/1). C.P. received support from Fundação para a Ciência e a Tecnologia (FCT) through LASIGE Research Unit (grant ID: UIDB/00408/2020 and UIDP/00408/2020). A.L., F.M.Z., C.P., A.R., A.P., and J.A.A. received support from European Union’s Horizon 2020 research and innovation programme (grant ID: 101017453). C.B. received support from Agence Nationale de la Recherche (ANR) through GRAL LabEX (grant ID: ANR-10-LABX-49-01) and CBH-EUR-GS 32 (grant ID: ANR-17-EURE0003). S.N.S. received support from Cancer Research UK (CRUK) and the Chief Scientist's Office of Scotland (CSO): Experimental Cancer Medicine Centre (ECMC) (grant ID: ECMCQQR-2022/100017). A.L. received support from Chief Scientist's Office of Scotland (CSO) NRS Career Researcher Fellowship. R.O.N. received support from CRUK Cambridge Centre Thoracic Cancer Programme (grant ID: CTRQQR-2021\100012).

Tumor antigens can emerge through multiple mechanisms, including translation of non-coding genomic regions. This non-canonical category of antigens has recently gained attention; however, our understanding of how they recur within and between cancer types is still in its infancy. Therefore, we developed a proteogenomic pipeline based on deep learning de novo mass spectrometry to enable the discovery of non-canonical MHC-associated peptides (ncMAPs) from non-coding regions. Considering that the emergence of tumor antigens can also involve post-translational modifications, we included an open search component in our pipeline.
Leveraging the wealth of mass spectrometry-based immunopeptidomics, we analyzed 26 MHC class I immunopeptidomic studies of 9 different cancer types. We validated the de novo identified ncMAPs, along with the most abundant post-translational modifications, using spectral matching and controlled their false discovery rate (FDR) to 1%. Interestingly, the non-canonical presentation appeared to be 5 times enriched for the A03 HLA supertype, with a projected population coverage of 54.85%. Here, we reveal an atlas of 8,601 ncMAPs with varying levels of cancer selectivity and suggest 17 cancer-selective ncMAPs as attractive targets according to a stringent cutoff. In summary, the combination of the open-source pipeline and the atlas of ncMAPs reported herein could facilitate the identification and screening of ncMAPs as targeting agents for T-cell therapies or vaccine development.

Lund University Publications

UCL Discovery

Edinburgh Research Explorer

University of St. Andrews - Pure

St Andrews Research Repository

Results of the ontology alignment evaluation initiative 2020

Author: Algergawy Alsayed
Amini Reihaneh
Faria Daniel
Fundulaki Irini
Harrow Ian
Hertling Sven
Hitzler Pascal
Jiménez-Ruiz Ernesto
Jonquet Clement
Karam Naouel
Khiat Abderrahmane
Laadhar Amir
Lambrix Patrick
Li Huanyu
Li Ying
Paulheim Heiko
Pesquita Catia
Pour Mina Abd Nikooie
Saveta Tzanina
Shvaiko Pavel
Splendiani Andrea
Thiéblin Élodie
Trojahn Cassia
Vataščinová Jana
Yaman Beyza
Zamazal Ondřej
Zhou Lu
Publication venue: CEUR-WS
Publication date: 01/01/2020

The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus). The OAEI 2020 campaign offered 12 tracks with 36 test cases, and was attended by 19 participants. This paper is an overall presentation of that campaign.

City Research Online

Scientific Publications of the University of Toulouse II Le Mirail

MAnnheim DOCument Server